Real-time Bayesian Anomaly Detection for Environmental Sensor Data
Authors
Abstract
Recent advances in sensor technology are facilitating the deployment of sensors into the environment that can produce measurements at high spatial and/or temporal resolutions. Not only can these data be used to better characterize systems for improved modeling, but they can also be used to produce better understandings of the mechanisms of environmental processes. One such use of these data is anomaly detection to identify data that deviate from historical patterns. These anomalous data can be caused by sensor or data transmission errors or by infrequent system behaviors that are often of interest to the scientific or public safety communities. Thus, anomaly detection has many practical applications, such as data quality assurance and control (QA/QC), where anomalous data are treated as data errors; focused data collection, where anomalous data indicate segments of data that are of interest to researchers; and event detection, where anomalous data signal system behaviors that could result in a natural disaster. This study develops two automated anomaly detection methods that employ Dynamic Bayesian Networks (DBNs). These machine learning methods can operate on a single sensor data stream, or they can consider several data streams at once, using all of the streams concurrently to perform coupled anomaly detection. This study investigates these methods’ abilities, using both coupled and uncoupled detection, to perform QA/QC on two windspeed data streams from Corpus Christi, Texas; false positive and false negative rates serve as the basis for comparison of the methods. The results indicate that a coupled DBN anomaly detector, tracking the actual windspeeds, their measurements, and the status of these measurements, performs well at identifying erroneous data in these data streams.
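As a rough illustration of the filtering idea behind DBN-based detection, the sketch below uses a scalar Kalman filter, one of the simplest linear-Gaussian DBNs, on a single data stream. It is not the study's implementation. The random-walk state model, the noise variances `q` and `r`, the threshold `k`, and the function name `detect_anomalies` are all assumptions made for this example. A measurement is flagged when it falls outside the k-sigma predictive interval, and flagged measurements are excluded from the state update so that they do not contaminate later predictions.

```python
import math


def detect_anomalies(measurements, q=0.5, r=0.25, k=3.0, p0=1.0):
    """Flag measurements outside the k-sigma predictive interval.

    Assumed model (hypothetical, for illustration only):
      state:       x_t = x_{t-1} + w,  w ~ N(0, q)
      measurement: z_t = x_t + v,      v ~ N(0, r)
    """
    if not measurements:
        return []

    mean = measurements[0]  # initial state estimate (assumption)
    var = p0                # initial state variance (assumption)
    flags = []

    for z in measurements:
        # Predict step: the random walk keeps the mean and inflates the variance.
        var_pred = var + q
        # Variance of the predicted measurement.
        s = var_pred + r

        is_anomaly = abs(z - mean) > k * math.sqrt(s)
        flags.append(is_anomaly)

        if is_anomaly:
            # Treat the flagged value as missing: keep the prediction.
            var = var_pred
        else:
            # Standard Kalman update with the accepted measurement.
            gain = var_pred / s
            mean = mean + gain * (z - mean)
            var = (1.0 - gain) * var_pred

    return flags


if __name__ == "__main__":
    windspeeds = [5.0, 5.2, 4.9, 5.1, 25.0, 5.0, 5.3]  # made-up values
    print(detect_anomalies(windspeeds))
```

With the made-up values above, only the spike at 25.0 is flagged. A coupled detector would extend the state to several streams and use a joint covariance, which this single-stream sketch does not attempt.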
Similar resources
Real-time Bayesian anomaly detection in streaming environmental data
With large volumes of data arriving in near real time from environmental sensors, there is a need for automated detection of anomalous data caused by sensor or transmission errors or by infrequent system behaviors. This study develops and evaluates three automated anomaly detection methods using dynamic Bayesian networks (DBNs), which perform fast, incremental evaluation of data as they bec...
Anomaly Detection and Redundancy Elimination of Big Sensor Data in Internet of Things
In the era of big data and the Internet of Things, massive amounts of sensor data are gathered through the Internet of Things. The data captured by sensor networks are considered to contain highly useful and valuable information. However, for a variety of reasons, received sensor data often appear abnormal. Therefore, effective anomaly detection methods are required to guarantee the quality of data collected...
A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows
One of the most important issues concerning sensor data in Wireless Sensor Networks (WSNs) is unexpected data acquired from the sensors. Today, there are numerous approaches for detecting anomalies in WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...
Probabilistic Models for Anomaly Detection in Remote Sensor Data Streams
Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors ranging from corrupt communications to partial or total sensor failures. This means that the raw dat...
Separating the Wheat from the Chaff: Practical Anomaly Detection Schemes in Ecological Applications of Distributed Sensor Networks
We develop a practical, distributed algorithm to detect events, identify measurement errors, and infer missing readings in ecological applications of wireless sensor networks. To address issues of non-stationarity in environmental data streams, each sensor-processor learns statistical distributions of differences between its readings and those of its neighbors, as well as between its current an...